-
Notifications
You must be signed in to change notification settings - Fork 192
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
👌 IMPROVE: Add SQLA migration for parity with Django schema #5097
Conversation
a8b9da7
to
3831363
Compare
3831363
to
35621dc
Compare
for more information, see https://pre-commit.ci
for more information, see https://pre-commit.ci
Codecov Report
@@ Coverage Diff @@
## develop #5097 +/- ##
===========================================
+ Coverage 81.90% 82.01% +0.11%
===========================================
Files 530 533 +3
Lines 38009 38230 +221
===========================================
+ Hits 31128 31350 +222
+ Misses 6881 6880 -1
Flags with carried forward coverage won't be shown. Click here to find out more.
Continue to review full report at Codecov.
|
Ok this is coming along now. One thing to note is that django auto-generates names for indexes that have these partial-uuids as part of the name, e.g. For parity, we want these names to be exactly the same so, assuming there is no possible django migration to change the names, I guess the migration here should rename all indexes. Note, alembic does not allow for an actual rename operation: sqlalchemy/alembic#246, so we would need to drop the indexes then recreate them |
Analysis of previous unique constraints: $ verdi -p a-import-django devel run-sql "SELECT tbl.relname,c.conname,c.contype,c.conkey FROM pg_constraint AS c INNER JOIN pg_class AS tbl ON tbl.oid = c.conrelid WHERE c.contype='u' ORDER BY tbl.relname,c.conname;"
('db_dbauthinfo', 'db_dbauthinfo_aiidauser_id_dbcomputer_id_777cdaa8_uniq', 'u', [5, 6])
('db_dbcomment', 'db_dbcomment_uuid_49bac08c_uniq', 'u', [2])
('db_dbcomputer', 'db_dbcomputer_name_key', 'u', [3])
('db_dbcomputer', 'db_dbcomputer_uuid_f35defa6_uniq', 'u', [2])
('db_dbgroup', 'db_dbgroup_name_type_12656f33_uniq', 'u', [3, 4])
('db_dbgroup', 'db_dbgroup_uuid_af896177_uniq', 'u', [2])
('db_dbgroup_dbnodes', 'db_dbgroup_dbnodes_dbgroup_id_dbnode_id_eee23cce_uniq', 'u', [2, 3])
('db_dblog', 'db_dblog_uuid_9cf77df3_uniq', 'u', [10])
('db_dbnode', 'db_dbnode_uuid_62e0bf98_uniq', 'u', [2])
('db_dbsetting', 'db_dbsetting_key_1b84beb4_uniq', 'u', [2])
('db_dbuser', 'db_dbuser_email_30150b7e_uniq', 'u', [5])
$ verdi -p a-import-sqla devel run-sql "SELECT tbl.relname,c.conname,c.contype,c.conkey FROM pg_constraint AS c INNER JOIN pg_class AS tbl ON tbl.oid = c.conrelid WHERE c.contype='u' ORDER BY tbl.relname,c.conname;"
('db_dbauthinfo', 'db_dbauthinfo_aiidauser_id_dbcomputer_id_key', 'u', [2, 3])
('db_dbcomment', 'db_dbcomment_uuid_key', 'u', [2])
('db_dbcomputer', 'db_dbcomputer_label_key', 'u', [3])
('db_dbcomputer', 'db_dbcomputer_uuid_key', 'u', [2])
('db_dbgroup', 'db_dbgroup_label_type_string_key', 'u', [3, 4])
('db_dbgroup', 'db_dbgroup_uuid_key', 'u', [2])
('db_dbgroup_dbnodes', 'db_dbgroup_dbnodes_dbgroup_id_dbnode_id_key', 'u', [3, 2])
('db_dblog', 'db_dblog_uuid_key', 'u', [10])
('db_dbnode', 'db_dbnode_uuid_key', 'u', [2])
('db_dbsetting', 'db_dbsetting_key_key', 'u', [2]) |
Note it is non possible to set database level defaults in django (https://code.djangoproject.com/ticket/470), so I am removing the |
another issue with django, it didn't rename the indexes & constraints when applying |
In aiidateam#3582, SQLA migrations were broken, by starting a top-level transaction, instead of letting alembic open per-migration transaction.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have given the code a look now. Will now to try to run the migration locally and test things a bit.
aiida/backends/sqlalchemy/migrations/versions/91b573400be5_prepare_schema_reset.py
Outdated
Show resolved
Hide resolved
aiida/backends/sqlalchemy/migrations/versions/1de112340b16_django_parity_1.py
Outdated
Show resolved
Hide resolved
sa.Column('type', sa.String(255)), | ||
) | ||
|
||
op.execute(db_dblink.update().where(db_dblink.c.type.is_(None)).values(type='')) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Already commented on the default setting for the Comment.ctime
, but this thing seems way more serious. I wonder if we should collect all default settings that would be problematic if there are really rows out there that have null values. We should then have a "integrity-check" that finds these. I fear that if we just silently set these missing values to a default, the database will be broken at other points. Having a Link
without a type is bad...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well I imagine having them set to None
would already be bad, and so going from None -> ''
is not going to break the database any more than it was already.
I think in reality the python code should have already stopped this from occurring, but yeh having an integrity-check could be done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well maybe it would actually be better to have it fail in this case. Assuming that these cases are rare or not existing, we could analyze the problem and provide a proper fix. Tecnhically we would be able to deduce the correct type based on input and output nodes. I think this column is too delicate to be changing silently
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, so if I just remove this line then.
Is there any other you think should be definitely removed?
I guess we can think to provide a "fix" under verdi storage
for such issues,
but I imagine we can defer that to another PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(and any that do get removed, I can add a comment on in the docstring)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, essentially there are these two comments remaining: (1) remove the setting of defaults that should really be considered as grave bugs (such as this one. Please also check if there are others), and (2) if you keep the commented out lines for operations that should have been there in principle but can't due to problems in original schema, add a generic comment in top docstring.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok I have removed this default setting, and added a comment to the docstring. I didn't find any other's that would corrupt the database
aiida/backends/sqlalchemy/migrations/versions/1de112340b16_django_parity_1.py
Outdated
Show resolved
Hide resolved
aiida/backends/sqlalchemy/migrations/versions/1de112340b18_django_parity_3.py
Show resolved
Hide resolved
aiida/backends/sqlalchemy/migrations/versions/1de112340b18_django_parity_3.py
Show resolved
Hide resolved
O.M.G. So after here:
and here:
(at least) you end up within an "idle" transaction: $ ps x | grep postgres
...
12539 ?? Ss 0:00.01 postgres: aiida_qs_chrisjsewell_3abef2c96e126c712c1d3e256a3590b9 test-sqla9_chrisjsewell_3abef2c96e126c712c1d3e256a3590b9 ::1(51834) idle in transaction when alembic then tries to alter the table during the migration:
it hangs waiting for this idle transaction to close 7434 ?? Ss 0:00.01 postgres: aiida_qs_chrisjsewell_3abef2c96e126c712c1d3e256a3590b9 test-sqla9_chrisjsewell_3abef2c96e126c712c1d3e256a3590b9 ::1(50629) idle in transaction
7435 ?? Ss 0:00.01 postgres: aiida_qs_chrisjsewell_3abef2c96e126c712c1d3e256a3590b9 test-sqla9_chrisjsewell_3abef2c96e126c712c1d3e256a3590b9 ::1(50630) idle
7436 ?? Ss 0:00.04 postgres: aiida_qs_chrisjsewell_3abef2c96e126c712c1d3e256a3590b9 test-sqla9_chrisjsewell_3abef2c96e126c712c1d3e256a3590b9 ::1(50631) ALTER TABLE waiting I don't know why this idle transaction opens, and cannot think how to stop it from opening. transaction = get_scoped_session().get_transaction()
if transaction:
transaction.close() before starting the migration, this closes any current transaction, allowing for the migration to proceed properly |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @chrisjsewell I think we can almost merge this. I closed all resolved comments, but there are still a couple outstanding
aiida/backends/sqlalchemy/migrations/versions/91b573400be5_prepare_schema_reset.py
Outdated
Show resolved
Hide resolved
Ok trying reverting in |
Ok I just tested a "manual" migration, and all seems to be running as expected now 🎉 |
Co-authored-by: Sebastiaan Huber <[email protected]>
Co-authored-by: Sebastiaan Huber <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, just found a few more things, some minor, but some very important. I made them all as suggestions, so I think you can just all batch them in one go and we can merge. Think you will agree with all of them
aiida/backends/sqlalchemy/migrations/versions/1de112340b18_django_parity_3.py
Outdated
Show resolved
Hide resolved
aiida/backends/sqlalchemy/migrations/versions/1de112340b17_django_parity_2.py
Outdated
Show resolved
Hide resolved
aiida/backends/sqlalchemy/migrations/versions/1de112340b16_django_parity_1.py
Outdated
Show resolved
Hide resolved
aiida/backends/sqlalchemy/migrations/versions/1de112340b17_django_parity_2.py
Outdated
Show resolved
Hide resolved
aiida/backends/sqlalchemy/migrations/versions/1de112340b18_django_parity_3.py
Outdated
Show resolved
Hide resolved
aiida/backends/sqlalchemy/migrations/versions/1de112340b16_django_parity_1.py
Outdated
Show resolved
Hide resolved
aiida/backends/sqlalchemy/migrations/versions/1de112340b16_django_parity_1.py
Outdated
Show resolved
Hide resolved
aiida/backends/sqlalchemy/migrations/versions/1de112340b16_django_parity_1.py
Outdated
Show resolved
Hide resolved
Co-authored-by: Sebastiaan Huber <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @chrisjsewell let's get this show on the road
for more information, see https://pre-commit.ci
cheers for the review! |
This PR adds migrations to the SQLAlchemy schema, to make it identical to the Django schema, and adds tests to ensure this parity.
There are three migrations:
NOT NULL
restriction, are indeed non-nullnullable=False
db_dblog.levelname
:String(255) -> String(50)
db_dbsetting.key
:String(255) -> String(1024)
db_dbsetting.description
:String(255) -> Text()
tests/backends/test_schema_parity.py
then tests the parity of the schema columns (name, type, nullable), primary key constraints, unique constraints and indexes.This also required a regression fix for #3582, which had actually broken the sqlalchemy migration mechanism: instead of carrying out each migration in a separate transaction, this had caused all migrations to be carried out in the same transaction. This caused
cannot ALTER TABLE "db_xxx" because it has pending trigger events
errors, where changes in one migration were not "flushed" to the database on a transaction close.In renaming indexes/constraints the sqlalchemy models now adopt the auto-generated names from django (see https://github.com/django/django/blob/2f33217ea2cad688040dd6044cdda946c62e5b65/django/db/backends/base/schema.py#L989).
In order to control the naming, all indexes and constraints are now specified specifically in the sqlalchemy models (in
__table_args__
), as opposed to being auto-generated via theindex
andunique
options on the relevantColumn
.It is also of note, that due to not taking into account https://code.djangoproject.com/ticket/23577, some of the django migrations were not properly written to rename index/constraint names, and so some names do not actually mirror their column names, for example
db_dbgroup_name_...
instead ofdb_dbgroup_label_...
.Finally, the database default for
repository_metadata
was removed, since this is not possible to set in django (see https://code.djangoproject.com/ticket/470), and theDbNode.reposotry_metadata
properties have been modified, to ensure they return an empty dict, rather thanNone
.closes #2303, and a step towards addressing #4985